Liverpool Bay
A Details of Data Augmentation with External Knowledge Resources. Enhance Relation Recognition: We enriched the relationships between objects parsed from the
The hyperparameters for training are detailed in Table 7. We perform the human evaluation on two of the four in-depth knowledge quality assessment metrics. Validity (↑): whether the generated visual knowledge is valid to humans. Conformity (↑): whether the generated knowledge faithfully depicts the scenarios in the images. Our calculated average pairwise Cohen's
Suppose you are looking at an image that contains the following subject and object entities: Subject list: [Insert the subject names here] Object list: [Insert the object names here] Please extract 5-10 condensed descriptions that describe the interactions and/or relations among those entities in the image.
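The human evaluation excerpt above refers to an average pairwise Cohen's agreement score among annotators, presumably Cohen's kappa. As a minimal sketch of how such a figure could be computed, the following uses the standard kappa formula; the function names and the annotator ratings are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(a, b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(a) != len(b) or not a:
        raise ValueError("rating lists must be non-empty and of equal length")
    n = len(a)
    # Observed agreement: fraction of items both raters labeled identically.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used the same single label throughout
    return (observed - expected) / (1.0 - expected)


def average_pairwise_kappa(ratings):
    """Mean kappa over every unordered pair of annotators."""
    pairs = list(combinations(ratings, 2))
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)


# Hypothetical binary validity judgments from three annotators (illustration only).
annotators = [
    [1, 1, 0, 1, 0, 1],
    [1, 1, 0, 1, 1, 1],
    [1, 0, 0, 1, 0, 1],
]
print(round(average_pairwise_kappa(annotators), 3))
```

Averaging over pairs is one simple way to summarize agreement among more than two annotators; the excerpt does not say which aggregation the authors used beyond "average pairwise".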
- Europe > United Kingdom > North Sea > Southern North Sea (0.04)
- Europe > United Kingdom > Irish Sea > East Irish Sea > Liverpool Bay (0.04)
- North America > United States (0.14)
- Europe > United Kingdom > Irish Sea > East Irish Sea > Liverpool Bay (0.04)
- Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)
- Asia > China > Hong Kong (0.04)
- Health & Medicine (0.68)
- Transportation (0.46)
- Leisure & Entertainment (0.46)
- Europe > Switzerland > Zürich > Zürich (0.14)
- North America > United States > North Dakota > Burke County (0.04)
- North America > United States > Maryland > Baltimore (0.04)
- (5 more...)
- Leisure & Entertainment > Sports (1.00)
- Transportation (0.69)
- Europe > Switzerland > Zürich > Zürich (0.14)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- Asia > China (0.04)
- (7 more...)
- Leisure & Entertainment > Sports (1.00)
- Transportation (0.93)
- Information Technology (0.92)
- Government (0.68)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
MVA 2025 Small Multi-Object Tracking for Spotting Birds Challenge: Dataset, Methods, and Results
Kondo, Yuki, Ukita, Norimichi, Kanayama, Riku, Yoshida, Yuki, Yamaguchi, Takayuki, Yu, Xiang, Liang, Guang, Liu, Xinyao, Wang, Guan-Zhang, Chu, Wei-Ta, Chuang, Bing-Cheng, Lee, Jia-Hua, Kuo, Pin-Tseng, Chu, I-Hsuan, Hsiao, Yi-Shein, Wu, Cheng-Han, Wu, Po-Yi, Tsou, Jui-Chien, Liu, Hsuan-Chi, Lee, Chun-Yi, Yang, Yuan-Fu, Shigematsu, Kosuke, Shin, Asuka, Tran, Ba
Small Multi-Object Tracking (SMOT) is particularly challenging when targets occupy only a few dozen pixels, rendering detection and appearance-based association unreliable. Building on the success of the MVA2023 SOD4SB challenge, this paper introduces the SMOT4SB challenge, which leverages temporal information to address limitations of single-frame detection. Our three main contributions are: (1) the SMOT4SB dataset, consisting of 211 UAV video sequences with 108,192 annotated frames under diverse real-world conditions, designed to capture motion entanglement where both camera and targets move freely in 3D; (2) SO-HOTA, a novel metric combining Dot Distance with HOTA to mitigate the sensitivity of IoU-based metrics to small displacements; and (3) a competitive MVA2025 challenge with 78 participants and 308 submissions, where the winning method achieved a 5.1x improvement over the baseline. This work lays a foundation for advancing SMOT in UAV scenarios with applications in bird strike avoidance, agriculture, fisheries, and ecological monitoring.
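The abstract's point about IoU being sensitive to small displacements can be illustrated with a short sketch. The code below compares IoU for a small and a large box under the same 5-pixel shift, alongside a center-distance similarity of the form exp(-D / scale). That functional form, the `scale` value, and the function names are assumptions made for illustration; this is not the SO-HOTA definition from the paper.

```python
import math


def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def center_distance_similarity(box_a, box_b, scale):
    """exp(-D / scale), where D is the distance between box centers.

    Assumed form for illustration; `scale` is a hypothetical reference size.
    """
    acx, acy = (box_a[0] + box_a[2]) / 2, (box_a[1] + box_a[3]) / 2
    bcx, bcy = (box_b[0] + box_b[2]) / 2, (box_b[1] + box_b[3]) / 2
    return math.exp(-math.hypot(acx - bcx, acy - bcy) / scale)


# The same 5-pixel horizontal shift applied to a 10x10 and a 100x100 box.
small, small_shifted = (0, 0, 10, 10), (5, 0, 15, 10)
large, large_shifted = (0, 0, 100, 100), (5, 0, 105, 100)

print("IoU small:", iou(small, small_shifted))
print("IoU large:", iou(large, large_shifted))
print("center similarity small:", center_distance_similarity(small, small_shifted, scale=10.0))
print("center similarity large:", center_distance_similarity(large, large_shifted, scale=10.0))
```

IoU drops far more for the small box than for the large one under an identical shift, whereas the center-based score depends only on the displacement and the chosen scale.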
- Asia > Taiwan (0.05)
- Asia > Japan > Honshū > Tōhoku > Iwate Prefecture (0.04)
- Asia > China > Jiangsu Province > Nanjing (0.04)
- Europe > United Kingdom > Irish Sea > East Irish Sea > Liverpool Bay (0.04)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Coupling Agent-Based Simulations and VR universes: the case of GAMA and Unity
Drogoul, Alexis, Taillandier, Patrick, Brugière, Arthur, Martinez, Louis, Sillano, Léon, Lesquoy, Baptiste, Nghi, Huynh Quang
Agent-based models (ABMs) and video games, including those taking advantage of virtual reality (VR), have undergone a remarkable parallel evolution, achieving impressive levels of complexity and sophistication. This paper argues that while ABMs prioritize scientific analysis and understanding and VR aims for immersive entertainment, they both simulate artificial worlds and can benefit from closer integration. Coupling both approaches indeed opens interesting possibilities for research and development in various fields, and in particular education, at the heart of the SIMPLE project, an EU-funded project on the development of digital tools for awareness raising on environmental issues. However, existing tools often present limitations, including technical complexity, limited functionalities, and lack of interoperability. To address these challenges, we introduce a novel framework for linking GAMA, a popular ABM platform, with Unity, a widely used game engine. This framework enables seamless data exchange, real-time visualization, and user interaction within VR environments, allowing researchers to leverage the strengths of both ABMs and VR for more impactful and engaging simulations. We demonstrate the capabilities of our framework through two prototypes built to highlight its potential in representing and interacting with complex socio-environmental system models. We conclude by emphasizing the importance of continued collaboration between the ABM and VR communities to develop robust, user-friendly tools, paving the way for a new era of collaborative research and immersive experiences in simulations.
- Asia > Vietnam > Hanoi > Hanoi (0.05)
- South America > Brazil (0.04)
- Asia > Vietnam > Hanoi > Hoàn Kiếm District, Hanoi (0.04)
- (9 more...)
- Leisure & Entertainment > Games > Computer Games (1.00)
- Health & Medicine (1.00)
- Education (1.00)
Improving the perception of visual fiducial markers in the field using Adaptive Active Exposure Control
Ren, Ziang, Lensgraf, Samuel, Li, Alberto Quattrini
Accurate localization is fundamental for autonomous underwater vehicles (AUVs) to carry out precise tasks, such as manipulation and construction. Vision-based solutions using fiducial markers are promising, but extremely challenging underwater because of harsh lighting conditions. This paper introduces a gradient-based active camera exposure control method to tackle sharp lighting variations during image acquisition, which can establish a better foundation for subsequent image enhancement procedures. Considering a typical scenario for underwater operations where visual tags are used, we propose several experiments comparing our method with other state-of-the-art exposure control methods, including Active Exposure Control (AEC) and Gradient-based Exposure Control (GEC). Results show a significant improvement in the accuracy of robot localization. This method is an important component that can be used in a vision-based state estimation pipeline to improve the overall localization accuracy.
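To make the idea of gradient-based exposure selection concrete, here is a generic sketch: score each candidate exposure by the total image gradient it would produce and keep the best one. The clipping sensor model, the scoring function, the candidate list, and all names are assumptions for illustration; this is not the method proposed in the paper, nor AEC or GEC.

```python
def simulate_exposure(radiance, exposure, max_value=255.0):
    """Toy sensor model (assumption): pixel = radiance * exposure, clipped at saturation."""
    return [[min(max_value, r * exposure) for r in row] for row in radiance]


def gradient_score(image):
    """Sum of absolute horizontal and vertical differences between neighboring pixels."""
    score = 0.0
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if x + 1 < len(row):
                score += abs(row[x + 1] - value)
            if y + 1 < len(image):
                score += abs(image[y + 1][x] - value)
    return score


def best_exposure(radiance, candidates):
    """Return the candidate exposure whose simulated image has the largest gradient score."""
    return max(candidates, key=lambda e: gradient_score(simulate_exposure(radiance, e)))


# Hypothetical 2x4 scene radiance; a very high exposure saturates the bright pixels.
scene = [[10, 20, 40, 80], [80, 40, 20, 10]]
print(best_exposure(scene, [0.5, 1, 2, 4, 8]))
```

Underexposure compresses the gradients and overexposure flattens them through saturation, so the score peaks at an intermediate exposure. A real controller would estimate this from captured frames rather than from a known radiance map.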
- North America > United States > Connecticut (0.05)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > New Hampshire > Grafton County > Hanover (0.04)
- Europe > United Kingdom > Irish Sea > East Irish Sea > Liverpool Bay (0.04)
Small and Dim Target Detection in IR Imagery: A Review
Kumar, Nikhil, Singh, Pravendra
While there has been significant progress in object detection using conventional image processing and machine learning algorithms, small and dim target detection in the IR domain is a relatively new area of study. The majority of small and dim target detection methods are derived from conventional object detection algorithms, albeit with some alterations. Detecting small and dim targets in IR imagery is complex because these targets often lack distinct features, the background is cluttered with unclear details, and the IR signatures of the scene can change over time due to thermodynamic fluctuations. The primary objective of this review is to highlight the progress made in this field. This is the first review in the field of small and dim target detection in infrared imagery, encompassing methodologies ranging from conventional image processing to cutting-edge deep learning-based approaches. The authors also introduce a taxonomy of such approaches. There are two main types: multi-frame detection methods and single-frame detection techniques. Single-frame detection techniques span a diverse range of methods, from traditional image processing-based approaches to more advanced deep learning methodologies. Our findings indicate that deep learning approaches perform better than traditional image processing-based approaches. In addition, a comprehensive compilation of available datasets is provided. Furthermore, this review identifies the gaps and limitations in existing techniques, paving the way for future research and development in this area.
- Asia > India > Uttarakhand > Roorkee (0.04)
- Asia > India > Uttarakhand > Dehradun (0.04)
- North America > United States > New York (0.04)
- (7 more...)
- Overview (1.00)
- Research Report > New Finding (0.65)
- Health & Medicine (0.68)
- Information Technology (0.67)
- Energy (0.46)
Open Visual Knowledge Extraction via Relation-Oriented Multimodality Model Prompting
Cui, Hejie, Fang, Xinyu, Zhang, Zihan, Xu, Ran, Kan, Xuan, Liu, Xin, Yu, Yue, Li, Manling, Song, Yangqiu, Yang, Carl
Images contain rich relational knowledge that can help machines understand the world. Existing methods for visual knowledge extraction often rely on a pre-defined format (e.g., sub-verb-obj tuples) or vocabulary (e.g., relation types), restricting the expressiveness of the extracted knowledge. In this work, we take a first step toward a new paradigm of open visual knowledge extraction. To achieve this, we present OpenVik, which consists of an open relational region detector to detect regions potentially containing relational knowledge and a visual knowledge generator that generates format-free knowledge by prompting the large multimodality model with the detected region of interest. We also explore two data enhancement techniques for diversifying the generated format-free visual knowledge. Extensive knowledge quality evaluations highlight the correctness and uniqueness of the open visual knowledge extracted by OpenVik. Moreover, integrating our extracted knowledge across various visual reasoning applications shows consistent improvements, indicating the real-world applicability of OpenVik.
- North America > United States (0.14)
- Europe > United Kingdom > Irish Sea > East Irish Sea > Liverpool Bay (0.04)
- Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)
- Asia > China > Hong Kong (0.04)
- Health & Medicine (0.68)
- Transportation (0.46)
- Leisure & Entertainment (0.46)
Adaptive and Collaborative Bathymetric Channel-Finding Approach for Multiple Autonomous Marine Vehicles
Gershfeld, Nikolai, Paine, Tyler M., Benjamin, Michael R.
This paper reports an investigation into the problem of rapidly identifying a channel that crosses a body of water using one or more Unmanned Surface Vehicles (USVs). A new algorithm called Proposal Based Adaptive Channel Search (PBACS) is presented as a potential solution that improves upon current methods. The empirical performance of PBACS is compared to lawnmower surveying and to Markov decision process (MDP) planning with two state-of-the-art reward functions: Upper Confidence Bound (UCB) and Maximum Value Information (MVI). Each method is evaluated by the time it takes to identify a continuous channel through an area, using one, two, three, or four USVs. The methods are compared across ten simulated bathymetry scenarios and one field area, each with different channel layouts. The results from simulations and field trials indicate that, on average, multi-vehicle PBACS outperforms lawnmower, UCB, and MVI-based methods, especially when at least three vehicles are used.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.28)
- North America > United States > New York (0.04)
- North America > United States > Massachusetts > Barnstable County > Falmouth > Woods Hole (0.04)
- (3 more...)
- Research Report > New Finding (0.46)
- Research Report > Promising Solution (0.34)
- Health & Medicine > Pharmaceuticals & Biotechnology (0.55)
- Government > Regional Government > North America Government > United States Government (0.46)
- Government > Military (0.46)
- Transportation > Marine (0.41)